Abstract

Random linear network coding can be used in peer-to-peer networks to increase the efficiency 
of content distribution and distributed storage. However, these systems are particularly susceptible to 
Byzantine attacks. We quantify the impact of Byzantine attacks on the coded system by evaluating the 
probability that a receiver node fails to correctly recover a file. We show that even for a small probability 
of attack, the system fails with overwhelming probability. We then propose a novel signature scheme that 
allows packet-level Byzantine detection. This scheme allows one-hop containment of the contamination, 
and saves bandwidth by allowing nodes to detect and drop the contaminated packets. We compare the net 
cost of our signature scheme with various other Byzantine schemes, and show that when the probability 
of Byzantine attacks is high, our scheme is the most bandwidth efficient. 
I. Introduction

Network coding [1], an alternative to the traditional forwarding paradigm, allows algebraic 
mixing of packets in a network. It maximizes throughput for multicast transmissions [2], [3], 
[4], as well as robustness against failures [5] and erasures [6]. Random linear network coding 
(RLNC), in which nodes independently take random linear combinations of the packets, is
sufficient for multicast networks [7], and is suitable for dynamic and unstable networks, such as 
peer-to-peer (P2P) networks [8], [9]. 

A P2P network is a cooperative network in which storage and bandwidth resources are shared 
in a distributed architecture. This is a cost-effective and scalable way to distribute content to a 
large number of receivers. One such architecture is the BitTorrent system [10], which splits large 
files into small blocks. After a node downloads a block, it acts as a source for that particular 
block. The main challenges in these systems are the scheduling and management of rare blocks. 

As an alternative to current strategies for these challenges, [8], [9] propose the use of RLNC to 
increase the efficiency of content distribution in a P2P solution. These schemes are completely 
distributed and eliminate the need for a scheduler, since each node independently forwards a
random linear combination. In addition, there is a high probability that each packet a node 
receives is linearly independent of the previous ones, and thus, the problem of redundancy 
caused by the flooding approaches in traditional P2P networks is reduced. RLNC-based schemes
significantly reduce the downloading time and improve the robustness of the system [8], [11]. 

Despite their desirable properties, network coded P2P systems are particularly susceptible to 
Byzantine attacks [12], [13], [14] - the injection of corrupted packets into the information flow. 
Since network coding relies on mixing of packets, a single corrupted packet may easily corrupt 
the entire information flow [15], [16]. Furthermore, in P2P networks, there is typically no security 
control over the nodes that join the network and the packets that they redistribute. The topologies 
of the overlay graphs that arise from traditional P2P networks are often modeled as scale-free 
and small-world networks [17], [18], which are prone to the dissemination of epidemics, such 
as worms and viruses [19], [20]. Several authors address these problems in coded P2P networks.
We discuss these countermeasures in Section II. Most of them can be divided into two
main categories: (i) end-to-end error correction and (ii) misbehavior detection.

Motivated by these observations, we address the issues of Byzantine adversaries in coded P2P 
networks. The main contributions of this paper are as follows: 

• We propose a model for the evaluation of the impact of Byzantine attacks in coded P2P 
networks, and provide analytical results which show that, even for a small probability of 
attack, the information can become contaminated with overwhelming probability. 

• We propose a new efficient, packet-based signature scheme, designed specifically for RLNC 
systems, to detect Byzantine attacks by checking the membership of a received packet in 
the valid vector space. This scheme allows a one-hop containment of the contamination.

• We analyze the overhead in terms of bandwidth associated with our signature scheme, and 
compare it to that of various Byzantine detection schemes. We also show that our scheme 
is the most bandwidth efficient if the probability of attack is high. 

This paper is organized as follows. Section II gives an overview of network coding in P2P
networks and existing Byzantine detection schemes. In Section III, we analyze the impact of
Byzantine attacks on the system. We propose our signature scheme in Section IV and compare
its overhead with other schemes in Section V. Finally, we conclude in Section VI.

II. Background 

A. Network coding in P2P networks 

References [6], [7] propose a random block linear network coding system - a simple, practical 
capacity-achieving code, in which every node independently constructs its linear code randomly. 
In such a system, a source generates information in batches of G packets (called a generation). 
The source then multicasts them to its destination nodes using RLNC, where only the packets 
from the same generation are mixed. Note that RLNC is a distributed protocol that requires
no state information, making it suitable for dynamic and unstable networks where state
information may change rapidly or may be hard to obtain.
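As an illustration of the generation-based coding described above, the following is a minimal sketch of RLNC encoding and decoding over a small prime field. The field size `P = 257` and the helper names `encode` and `decode` are assumptions made for this example only; they are not taken from [6], [7].

```python
import random

P = 257  # small prime field size, assumed here purely for illustration


def encode(packets, rng):
    """Return (coding_vector, payload) for one random linear combination."""
    coeffs = [rng.randrange(P) for _ in packets]
    length = len(packets[0])
    payload = [sum(c * pkt[j] for c, pkt in zip(coeffs, packets)) % P
               for j in range(length)]
    return coeffs, payload


def decode(coded, g):
    """Gauss-Jordan elimination over F_P.

    Returns the g original packets, or None if the received coding
    vectors do not yet span all g degrees of freedom.
    """
    rows = [list(c) + list(p) for c, p in coded]
    rank = 0
    for col in range(g):
        pivot = next((r for r in range(rank, len(rows)) if rows[r][col]), None)
        if pivot is None:
            return None  # not enough linearly independent packets yet
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        inv = pow(rows[rank][col], -1, P)  # modular inverse (Python 3.8+)
        rows[rank] = [(v * inv) % P for v in rows[rank]]
        for r in range(len(rows)):
            if r != rank and rows[r][col]:
                f = rows[r][col]
                rows[r] = [(a - f * b) % P for a, b in zip(rows[r], rows[rank])]
        rank += 1
    return [rows[i][g:] for i in range(g)]
```

A receiver would keep collecting coded packets and call `decode` until it stops returning `None`; no scheduler or state exchange is needed, which is the property the text highlights.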

Several authors have evaluated the performance of network coding in P2P networks. Gkantsidis 
et al. [9] propose a scheme for content distribution of large files in which nodes make forwarding 
decisions solely based on local information. This scheme improves the expected file download 
time and the robustness of the system. Reference [8] compares the performance of network coding 
with traditional coding measures in a distributed storage setting with very limited storage space 
with the goal of minimizing the number of storage locations a file-downloader connects to. They 
show that RLNC performs well without the need for a large amount of additional storage space. 
Dimakis et al. [21] introduce a graph-theoretic framework for P2P distributed systems, and show
that RLNC minimizes the bandwidth required to maintain distributed storage architectures.

B. Byzantine detection scheme for network coded systems 

1) End-to-end error correction scheme: Reference [22] introduces network error correction
for coded systems. They bound the maximum achievable rate in an adversarial setting, and
generalize the Hamming, Gilbert-Varshamov, and Singleton bounds. Jaggi et al. [15] introduce
the first distributed polynomial-time rate-optimal network codes that work in the presence of 
Byzantine nodes and are information-theoretically secure. The adversarial nodes are viewed as 
a secondary source. The source adds redundancy to help the receivers distill out the source 
information from the received mixtures. This work is generalized in [23], [24]. 

2) Generation-based Byzantine detection scheme: Ho et al. [25] introduce an information- 
theoretic approach for detecting Byzantine adversaries, which only assumes that the adversary 
did not see all linear combinations received by the receivers. Their detection probability varies 
with the length of the hash, field size, and the amount of information unknown to the adversary. 
A polynomial hash is added to each packet in the generation. Once the destination node receives 
enough packets to decode a generation, it can probabilistically detect errors. The intuition behind 
this scheme is that if a packet is valid, then its data and hash are consistent with its coding vector; 
and a linear combination of valid packets is also valid.

3) Packet-based Byzantine detection scheme: There are several signature schemes that have 
been presented in the literature. For instance, [8], [26], [27] use homomorphic hash functions 
to detect contaminated packets. Reference [16] suggests the use of a Secure Random Checksum 
(SRC) which requires less computation than the homomorphic hash function, but requires a 
secure channel to transmit the SRCs. In addition, [28] proposes a signature scheme for network 
coding based on Weil pairing on elliptic curves. 

III. Impact of Byzantine attacks on P2P networks 

In this section, we first introduce our model for evaluating the probability of a distributed 
denial of service attack (DDoS) caused by Byzantine nodes in a P2P network. We then present 
results for two distinct scenarios. 

A. Model 

We consider a directed graph with a set of nodes $\mathcal{N}$. A source node has a large file to be sent
to receiver nodes. The file is divided into $m$ packets. To distribute it, the source connects to a subset
of nodes, $\mathcal{N}_s \subseteq \mathcal{N}$, chosen uniformly at random, and sends each of them a different random
linear combination of the original file packets. To ensure that enough degrees of freedom exist
in the network, $|\mathcal{N}_s| > m$. We refer to the nodes in $\mathcal{N}_s$ as level-s nodes. A tracker node keeps
track of the list of informed nodes, $N(t)$, i.e., nodes that hold an information packet.

For a receiver to retrieve the file, it connects to a subset of nodes $\mathcal{N}_r \subset \mathcal{N}$, chosen uniformly
at random, with $|\mathcal{N}_r| > |\mathcal{N}_s|$. We refer to the nodes in $\mathcal{N}_r$ as level-r nodes. Note that there may
be an overlap between level-s and level-r. In each time slot, one of the uninformed level-r nodes,
$n \in \mathcal{N}_r \setminus \mathcal{N}_s$, contacts the tracker to retrieve a random list of $d$ informed nodes, where $d < |\mathcal{N}_s|$.
The node $n$ then connects to these informed nodes through a secure overlay connection, retrieves
their packets, and stores a single random linear combination of these packets. During the same
time slot, the tracker updates its list of informed nodes to $N(t) \cup \{n\}$. This process is repeated
for all nodes in $\mathcal{N}_r \setminus \mathcal{N}_s$, and then all level-r nodes forward their stored packets to the receiver. In
order to maximize the probability of storing linearly independent combinations in level-r nodes
and ensure decodability at the receiver, we set $d \ge 2$. Although we assume that each node in
level-s and level-r stores only one packet, the model can be easily generalized to account for
higher numbers. An example of this network model is shown in Figure 1. Note that the tracker
is considered to be a trusted party in our model - in fact, as in the case of most P2P protocols,
a dishonest tracker would yield a protocol failure with overwhelming probability.

Fig. 1. Network model. The source is connected to the level-s nodes, and the receiver is connected to the level-r nodes. The
dark nodes are the informed nodes. The level-r nodes take turns to contact the tracker, and connect to $|D| = 2$ level-s nodes
based on the list returned by the tracker. Here, nodes $n_{r1}$ and $n_{r2}$ have completed this process, and the other level-r nodes have
not.
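The dissemination process above can be sketched as a small Monte Carlo simulation. This is only an illustrative sketch under stated assumptions: the function name `simulate_blocking`, its parameters, and the use of Python's `random` module are hypothetical, and the Byzantine behaviour is the per-node probability introduced later in this section. A level-r node is treated as contaminated if it is Byzantine or if any of its `d` contacts is contaminated.

```python
import random


def simulate_blocking(N, Ns, Nr, d, pb, evolving, trials=2000, seed=0):
    """Estimate the fraction of runs in which the receiver collects at
    least one contaminated packet (static or evolving tracker list)."""
    rng = random.Random(seed)
    blocked = 0
    for _ in range(trials):
        # Each node independently decides whether it is Byzantine.
        byz = [rng.random() < pb for _ in range(N)]
        level_s = rng.sample(range(N), Ns)
        level_r = rng.sample(range(N), Nr)
        # Level-s nodes receive clean packets; they are contaminated
        # only if they are Byzantine themselves.
        contaminated = {n: byz[n] for n in level_s}
        informed = list(level_s)  # tracker's list of informed nodes
        for n in level_r:
            if n in contaminated:
                continue  # already informed (n is also a level-s node)
            contacts = rng.sample(informed, d)
            contaminated[n] = byz[n] or any(contaminated[c] for c in contacts)
            if evolving:
                informed.append(n)  # tracker adds n to its list
        blocked += any(contaminated[n] for n in level_r)
    return blocked / trials
```

For example, `simulate_blocking(30, 5, 6, 3, 0.05, evolving=False)` gives an empirical estimate for one choice of parameters; the estimate depends on the seed and the number of trials.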

We define an Information Contact Graph $G(t) = \{N(t), A(t)\}$ to denote the evolving graph
formed in the above process, where $N(t)$ is the list of informed nodes and $A(t)$ is the set of
overlay links that connect the level-s and level-r nodes. The probability that a node becomes
a Byzantine attacker is $p_b$. An attacker corrupts the packet it stores by generating arbitrary
content while complying with the standard packet format. A node independently decides whether
it becomes Byzantine at the start of the file dissemination process according to $p_b$ and stays
that way throughout the process. We define an indicator variable $I_b(n)$ which is 1 if node $n$ is
Byzantine and 0 otherwise. The tracker has no information about which nodes are Byzantine.
A contaminated packet is a packet that is either directly corrupted by an attacker, or is a linear
combination that involves at least one contaminated packet. A contaminated node is a node
that stores a contaminated packet. The blocking probability $\Psi$ is the probability that the receiver
collects at least one contaminated packet, and thus, is unable to decode the file. This is equivalent
to the probability that the attacker successfully carries out a DDoS attack.

B. Analysis of Impact of Byzantine Attacks 

We now evaluate the blocking probability at the receiver. We then consider the expected number
of contaminated nodes at any given time. First, we introduce necessary definitions, as follows.
We define an indicator variable $I_c(t,n)$ which is equal to 1 if node $n$ is contaminated at time
$t$ and 0 otherwise. $C(t)$ is a random variable for the number of contaminated nodes in $N(t)$,
and $\bar{C}(t) = |N(t)| - C(t)$ is the number of uncontaminated nodes. The function $h(k; N, m, n)$
denotes the hypergeometric distribution, in which

$$h(k; N, m, n) = \binom{m}{k}\binom{N-m}{n-k} \Big/ \binom{N}{n}.$$

Let $N_b$ denote the number of informed Byzantine nodes at time $t = 0$, that is, the number of
Byzantine nodes in $\mathcal{N}_s$. $N_b$ has a binomial distribution with parameters $(|\mathcal{N}_s|, p_b)$.
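The hypergeometric term is used repeatedly in the analysis that follows, so a direct transcription may help fix the argument order. The sketch below assumes the standard hypergeometric probability mass function with the argument convention $h(k; N, m, n)$: population size $N$, $m$ marked items, $n$ draws, $k$ marked items drawn. The function name `h` mirrors the text; returning zero outside the support is a choice made for this example.

```python
from math import comb


def h(k, N, m, n):
    """P(k marked items among n draws without replacement from a
    population of N items, m of which are marked)."""
    if k < 0 or k > m or n - k < 0 or n - k > N - m:
        return 0.0  # outside the support of the distribution
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)
```

In the notation of this section, `h(0, Ns, i, d)` is the probability that a node drawing `d` contacts from `Ns` level-s nodes, `i` of which are Byzantine, avoids all Byzantine nodes.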

We consider two scenarios. In Theorem 1, for simplicity, we consider a static informed nodes
list, in which the list kept by the tracker is fixed to $\mathcal{N}_s$. In this case, level-r nodes only connect
to level-s nodes. Second, in Theorem 2, we generalize to the case in which the tracker updates
its list of informed nodes to $N(t)$, as stated in Section III-A.

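Theorem 1 (Static informed nodes list): Let $G(t)$ be an information contact graph in which
nodes in $\mathcal{N}_r$ only connect to nodes in $\mathcal{N}_s$. Then its blocking probability $\Psi$ is given by:

$$\Psi = 1 - \sum_{y=0}^{|\mathcal{N}_s|} h(y; |\mathcal{N}|, |\mathcal{N}_s|, |\mathcal{N}_r|) \left( \sum_{i=0}^{|\mathcal{N}_s|} \binom{|\mathcal{N}_s|}{i} p_b^i (1-p_b)^{|\mathcal{N}_s|-i} f(i,y) \right),$$

where

$$f(i,y) = \left(1 - \frac{i}{|\mathcal{N}_s|}\right)^{y} \Big[ (1-p_b)\, h(0; |\mathcal{N}_s|, i, d) \Big]^{|\mathcal{N}_r|-y}.$$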

Proof: We consider two disjoint subsets of $\mathcal{N}_r$: the set of nodes informed at $t = 0$, that is,
$\mathcal{N}_r \cap \mathcal{N}_s$, and the uninformed nodes, that is, $\mathcal{N}_r \setminus \mathcal{N}_s$. Let $Y$ be a random variable for the number
of nodes in $\mathcal{N}_r \cap \mathcal{N}_s$. $Y$ has a hypergeometric distribution, $P(Y = y) = h(y; |\mathcal{N}|, |\mathcal{N}_s|, |\mathcal{N}_r|)$.

We first consider $n \in \mathcal{N}_r \cap \mathcal{N}_s$. Given $N_b = i$ and $Y = y$, the probability that $n$ is
uncontaminated is equal to the probability that it is not initially Byzantine, which is equal
to $1 - i/|\mathcal{N}_s|$. Then, the probability that all nodes in $\mathcal{N}_r \cap \mathcal{N}_s$ are uncontaminated is:

$$P(I_c(0,n) = 0,\ \forall n \in \mathcal{N}_r \cap \mathcal{N}_s \mid N_b = i, Y = y) = \left(1 - \frac{i}{|\mathcal{N}_s|}\right)^{y}.$$

Now, at each time slot $t > 0$, a node $n \in \mathcal{N}_r \setminus \mathcal{N}_s$ becomes informed. For $n$ to be uncontaminated,
it must not be Byzantine and it must connect to $d$ uncontaminated nodes. Then,

$$P(I_c(t,n) = 0 \mid N_b = i, Y = y) = (1 - p_b)\, h(0; |\mathcal{N}_s|, i, d).$$

It follows that the probability that all nodes of $\mathcal{N}_r \setminus \mathcal{N}_s$ informed up to time $t$ are uncontaminated is:

$$P(I_c(t,n) = 0,\ \forall n \in \mathcal{N}_r \setminus \mathcal{N}_s \mid N_b = i, Y = y) = \Big[ (1-p_b)\, h(0; |\mathcal{N}_s|, i, d) \Big]^{t}, \quad \text{for } 0 < t \le |\mathcal{N}_r| - y.$$

Note that since $|\mathcal{N}_r \setminus \mathcal{N}_s|$ nodes are added, the information dissemination process ends at
$t = |\mathcal{N}_r| - y$. Now, the probability that only uncontaminated nodes exist in $\mathcal{N}_r$ at time $t = |\mathcal{N}_r| - y$,
conditioned on $Y = y$ and $N_b = i$, is:

$$f(i,y) = \left(1 - \frac{i}{|\mathcal{N}_s|}\right)^{y} \Big[ (1-p_b)\, h(0; |\mathcal{N}_s|, i, d) \Big]^{|\mathcal{N}_r|-y}.$$

$N_b$ has a binomial distribution, $Y$ has a hypergeometric distribution, and they are independent
of each other. Removing the conditioning on these two variables, the probability that all nodes in $\mathcal{N}_r$ are
uncontaminated is:

$$\gamma = \sum_{y=0}^{|\mathcal{N}_s|} h(y; |\mathcal{N}|, |\mathcal{N}_s|, |\mathcal{N}_r|) \sum_{i=0}^{|\mathcal{N}_s|} \binom{|\mathcal{N}_s|}{i} p_b^i (1-p_b)^{|\mathcal{N}_s|-i} f(i,y).$$

It follows that the blocking probability is $\Psi = 1 - \gamma$.
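As a numerical companion to Theorem 1, the following minimal sketch evaluates the static-list blocking probability by summing over $Y = y$ and $N_b = i$. The function names `h` and `psi_static` are chosen for this example, and the summation limits follow the reading of the theorem in which both sums run from 0 to $|\mathcal{N}_s|$; treat it as an aid for checking the formula rather than as the authors' code.

```python
from math import comb


def h(k, N, m, n):
    """Hypergeometric pmf: k marked among n draws, N items, m marked."""
    if k < 0 or k > m or n - k < 0 or n - k > N - m:
        return 0.0
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)


def psi_static(N, Ns, Nr, d, pb):
    """Blocking probability for the static informed nodes list."""
    gamma = 0.0
    for y in range(Ns + 1):
        p_y = h(y, N, Ns, Nr)  # P(Y = y)
        if p_y == 0.0:
            continue
        inner = 0.0
        for i in range(Ns + 1):
            p_i = comb(Ns, i) * pb**i * (1 - pb)**(Ns - i)  # P(N_b = i)
            clean = (1 - pb) * h(0, Ns, i, d)  # one new node stays clean
            f = (1 - i / Ns)**y * clean**(Nr - y)
            inner += p_i * f
        gamma += p_y * inner
    return 1 - gamma
```

Calling `psi_static(30, 5, 6, 3, pb)` for a range of `pb` values corresponds to the parameter set used in the figures of this section.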



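We now consider the case in which the list of informed nodes at the tracker is $N(t)$, that is, it is updated
with each new informed level-r node.

Theorem 2 (Evolving informed nodes list): Let $G(t)$ be an information contact graph in which
$|\mathcal{N}_r \setminus \mathcal{N}_s|$ nodes are to be added to the graph by connecting to nodes in $N(t)$. Then its blocking
probability $\Psi$ is:

$$\Psi = 1 - \sum_{y=0}^{|\mathcal{N}_s|} h(y; |\mathcal{N}|, |\mathcal{N}_s|, |\mathcal{N}_r|) \sum_{i=0}^{|\mathcal{N}_s|} \binom{|\mathcal{N}_s|}{i} p_b^i (1-p_b)^{|\mathcal{N}_s|-i} \left(1 - \frac{i}{|\mathcal{N}_s|}\right)^{y} \prod_{t=1}^{|\mathcal{N}_r|-y} (1-p_b)\, h(0; |\mathcal{N}_s| + t - 1, i, d).$$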
t=i 

Proof: Recall from Theorem 1 that we consider two disjoint subsets of $\mathcal{N}_r$, that is, $\mathcal{N}_r \cap \mathcal{N}_s$
and $\mathcal{N}_r \setminus \mathcal{N}_s$. As before, $Y$ is the number of nodes in $\mathcal{N}_r \cap \mathcal{N}_s$. Again, at time $t = 0$, the probability
that all nodes in $\mathcal{N}_r \cap \mathcal{N}_s$ are uncontaminated given $N_b = i$ and $Y = y$ is $(1 - i/|\mathcal{N}_s|)^y$.

We now consider the nodes in $\mathcal{N}_r \setminus \mathcal{N}_s$ and assume $N_b = i$, $Y = y$. At each time step, there
are $C(t)$ contaminated nodes and $\bar{C}(t) = |\mathcal{N}_s| + t - C(t)$ uncontaminated nodes in $N(t)$. The
probability of obtaining a contaminated node at time $t+1$ depends only on $C(t)$ and $\bar{C}(t)$,
and thus, we can model these probabilities by Markov chains $E_{N_b,Y} = \{S, P\}$, in which $S$
represents the set of states and $P$ represents the matrix of transition probabilities. A state in $S$ is
represented by $s = (C(t), \bar{C}(t))$. Transitions from $s$ are only possible to $s' = (C(t) + 1, \bar{C}(t))$ and
to $s'' = (C(t), \bar{C}(t) + 1)$. It is also important to note that the depth of the Markov chain is equal to
$|\mathcal{N}_r \setminus \mathcal{N}_s| = |\mathcal{N}_r| - y$. The transition probabilities from $s$ when adding a node $n$ are $P(s \to s') =
P(I_c(t+1,n) = 1 \mid C(t), \bar{C}(t), N_b, Y)$ and $P(s \to s'') = P(I_c(t+1,n) = 0 \mid C(t), \bar{C}(t), N_b, Y)$.
$E_{N_b,Y}$ is illustrated in Figure 2 for $|\mathcal{N}_r \setminus \mathcal{N}_s| = 2$.

Let us denote $C(t)$ as $x$ and $t' = |\mathcal{N}_s| + t$; it follows that $\bar{C}(t) = t' - x$. Now let $p_{\{x,\,t'-x\}}$ denote
the probability of being in state $s = (x, t'-x)$ at time $t$. $p_{\{x,\,t'-x\}}$ can be defined recursively as:

$$p_{\{x,\,t'-x\}} = p_{\{x-1,\,t'-x\}}\, P(\{x-1, t'-x\} \to \{x, t'-x\}) + p_{\{x,\,t'-x-1\}}\, P(\{x, t'-x-1\} \to \{x, t'-x\}),$$

$$P(\{x, t'-x\} \to \{x+1, t'-x\}) = 1 - P(I_c(t,n) = 0 \mid x, t'-x, N_b = i, Y = y),$$

$$P(\{x, t'-x\} \to \{x, t'-x+1\}) = P(I_c(t,n) = 0 \mid x, t'-x, N_b = i, Y = y),$$

$$p_{\{i,\,|\mathcal{N}_s|-i\}} = 1.$$

Fig. 2. Markov diagram for the dissemination process, $|\mathcal{N}_r| - Y = 2$. The transitions to the left (dotted arrows) represent the
addition of an uncontaminated node, and the transitions to the right (filled arrows) represent the addition of a contaminated
node. The grey states are considered in computing $\Psi$, that is, the states in which no contaminated nodes are added.

Now, consider that node $n$ is active at time $t$. The probability of $n$ being uncontaminated is
the probability that it is not Byzantine and does not connect to contaminated nodes. Thus,

$$P(I_c(t,n) = 0 \mid C(t-1), \bar{C}(t-1), N_b = i, Y = y) = (1 - p_b)\, h(0; |\mathcal{N}_s| + t - 1, C(t-1), d).$$

Now, notice that the probability of only having uncontaminated nodes at time $t = |\mathcal{N}_r| - y$ is the
probability of, starting in state $(C(0), \bar{C}(0)) = (i, |\mathcal{N}_s| - i)$, ending in state $(i, |\mathcal{N}_s| - i + |\mathcal{N}_r| - y)$
after $|\mathcal{N}_r| - y$ steps: in that case, no contaminated node is added to the network. The probability
of this event, conditioned on $N_b = i$ and $Y = y$, is

$$\prod_{t=1}^{|\mathcal{N}_r|-y} P(I_c(t,n) = 0 \mid C(t-1), \bar{C}(t-1), N_b = i, Y = y) = \prod_{t=1}^{|\mathcal{N}_r|-y} (1 - p_b)\, h(0; |\mathcal{N}_s| + t - 1, i, d).$$

Combining the results for sets $\mathcal{N}_r \cap \mathcal{N}_s$ and $\mathcal{N}_r \setminus \mathcal{N}_s$, we have that the probability that no
contaminated nodes exist in $\mathcal{N}_r$ given that $N_b = i$ and $Y = y$ is given by

$$\left(1 - \frac{i}{|\mathcal{N}_s|}\right)^{y} \prod_{t=1}^{|\mathcal{N}_r|-y} (1 - p_b)\, h(0; |\mathcal{N}_s| + t - 1, i, d).$$



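Finally, it follows that the blocking probability at time $|\mathcal{N}_r \setminus \mathcal{N}_s|$ is

$$\Psi = 1 - \sum_{y=0}^{|\mathcal{N}_s|} h(y; |\mathcal{N}|, |\mathcal{N}_s|, |\mathcal{N}_r|) \sum_{i=0}^{|\mathcal{N}_s|} \binom{|\mathcal{N}_s|}{i} p_b^i (1-p_b)^{|\mathcal{N}_s|-i} \left(1 - \frac{i}{|\mathcal{N}_s|}\right)^{y} \prod_{t=1}^{|\mathcal{N}_r|-y} (1 - p_b)\, h(0; |\mathcal{N}_s| + t - 1, i, d).$$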
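The expression of Theorem 2 can be evaluated in the same way as that of Theorem 1. The sketch below is a direct transcription under the same assumptions about summation limits; the names `h` and `psi_evolving` are hypothetical. The only change from the static case is that the population seen by the $t$-th added node is $|\mathcal{N}_s| + t - 1$ while the number of contaminated nodes stays at $i$ along the all-clean path of the Markov chain.

```python
from math import comb


def h(k, N, m, n):
    """Hypergeometric pmf: k marked among n draws, N items, m marked."""
    if k < 0 or k > m or n - k < 0 or n - k > N - m:
        return 0.0
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)


def psi_evolving(N, Ns, Nr, d, pb):
    """Blocking probability for the evolving informed nodes list."""
    gamma = 0.0
    for y in range(Ns + 1):
        p_y = h(y, N, Ns, Nr)  # P(Y = y)
        if p_y == 0.0:
            continue
        for i in range(Ns + 1):
            p_i = comb(Ns, i) * pb**i * (1 - pb)**(Ns - i)  # P(N_b = i)
            clean = (1 - i / Ns)**y  # nodes in both level-s and level-r
            for t in range(1, Nr - y + 1):
                # t-th added node: not Byzantine and avoids the i
                # contaminated nodes among Ns + t - 1 informed nodes
                clean *= (1 - pb) * h(0, Ns + t - 1, i, d)
            gamma += p_y * p_i * clean
    return 1 - gamma
```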



The results from Theorems 1 and 2 are illustrated in Figure 3. Note that even for a small $p_b$,
the blocking probability $\Psi$ is very high. Even for the case in Theorem 1, $\Psi$ grows exponentially.
This is because it is sufficient for a single level-r node to connect to a Byzantine node in level-s
to contaminate the receiver. Figure 3 indicates that $\Psi$ grows faster for the evolving informed
node list than for the static informed node list. This is due to the fact that as more nodes
are added to the network, the presence of contaminated nodes becomes more likely, and thus,
the probability that a level-r node connects to at least one contaminated node increases. The
probability $\Psi$ also increases with other parameters such as $d$, $|\mathcal{N}_s|$, and $|\mathcal{N}_r|$ since they increase
the probability of level-r nodes connecting to contaminated nodes.

From the above proofs, it follows that the number of contaminated nodes in $N(t)$, $t > 0$, is
dependent on the random variable $Y = |\mathcal{N}_r \cap \mathcal{N}_s|$. We now perform an analysis of the expected
number of contaminated nodes in the network $E[C(t)]$ conditioned on $Y = y$.

First, we consider the case of the static informed nodes list, conditioned on $N_b = i$, $Y = y$.
It is clear that $E[C(0) \mid N_b = i] = i$. Now, at each time step $t$, one contaminated node is added
to $N(t)$ with probability $1 - P(I_c(t,n) = 0 \mid N_b = i, Y = y)$ and thus $E[C(t) \mid N_b = i, Y = y] =
i + t(1 - (1 - p_b) h(0; |\mathcal{N}_s|, i, d))$. It follows that

$$E[C(t) \mid Y = y] = \sum_{i=0}^{|\mathcal{N}_s|} \binom{|\mathcal{N}_s|}{i} p_b^i (1-p_b)^{|\mathcal{N}_s|-i} \Big( i + t\big(1 - (1-p_b)\, h(0; |\mathcal{N}_s|, i, d)\big) \Big).$$

In the case of the evolving informed nodes list, since the states of $E_{N_b,Y}$ are representative
of the number of contaminated nodes in the network, $E[C(t) \mid N_b, Y]$ has a direct correspondence
to the expected state the Markov chain is in after $t$ time steps; therefore:

$$E[C(t) \mid Y = y] = \sum_{i=0}^{|\mathcal{N}_s|} \binom{|\mathcal{N}_s|}{i} p_b^i (1-p_b)^{|\mathcal{N}_s|-i} \left( \sum_{x=i}^{i+t} x\, p_{\{x,\,t'-x\}} \right).$$

Fig. 3. Blocking probability as a function of the Byzantine probability $p_b$, for $|\mathcal{N}| = 30$, $|\mathcal{N}_s| = 5$, $|\mathcal{N}_r| = 6$ and $d = 3$. The results for the static and
evolving informed nodes list are shown in full and dashed, respectively.

In order to visualize these results, we take the expected value of $Y$ for the set of parameters
chosen in Figure 3, which is equal to 1. Then, we plot $E[C(t) \mid Y = 1]$ for the static and evolving
informed node lists. Figure 4 shows that the expected number of contaminated nodes in
the static case is linear with time. For small probabilities $p_b$, $E[C(t) \mid Y = 1]$ is higher for
the evolving case; as $p_b$ increases, the values for both cases become similar.
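The two expectations can be computed numerically as follows. This is a sketch under stated assumptions: the function names are hypothetical, the static case uses the closed form above, and the evolving case propagates the distribution over the number of contaminated nodes $x$ through the Markov chain one step at a time, using $(1-p_b)\,h(0; |\mathcal{N}_s| + t - 1, x, d)$ as the probability that the node added at step $t$ stays clean.

```python
from math import comb


def h(k, N, m, n):
    """Hypergeometric pmf: k marked among n draws, N items, m marked."""
    if k < 0 or k > m or n - k < 0 or n - k > N - m:
        return 0.0
    return comb(m, k) * comb(N - m, n - k) / comb(N, n)


def expected_static(Ns, d, pb, t):
    """E[C(t)] for the static list (closed form, linear in t)."""
    total = 0.0
    for i in range(Ns + 1):
        p_i = comb(Ns, i) * pb**i * (1 - pb)**(Ns - i)
        total += p_i * (i + t * (1 - (1 - pb) * h(0, Ns, i, d)))
    return total


def expected_evolving(Ns, d, pb, t):
    """E[C(t)] for the evolving list, via the Markov chain recursion."""
    total = 0.0
    for i in range(Ns + 1):
        p_i = comb(Ns, i) * pb**i * (1 - pb)**(Ns - i)
        dist = {i: 1.0}  # x -> probability of x contaminated nodes
        for step in range(1, t + 1):
            population = Ns + step - 1  # informed nodes before this step
            nxt = {}
            for x, p in dist.items():
                clean = (1 - pb) * h(0, population, x, d)
                nxt[x] = nxt.get(x, 0.0) + p * clean
                nxt[x + 1] = nxt.get(x + 1, 0.0) + p * (1 - clean)
            dist = nxt
        total += p_i * sum(x * p for x, p in dist.items())
    return total
```

Both functions agree at $t = 0$ and $t = 1$, since the first added node sees only level-s nodes in either scenario; they can differ from $t = 2$ onward.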



IV. Signature scheme for Byzantine detection 

From the previous section, we can see that coded P2P networks are highly vulnerable to
Byzantine attacks, and the contamination can quickly spread throughout the network. Although
we only consider a particular network model in Section III for the purpose of analysis, such
problems exist in all network coded systems. Therefore, it is desirable to have a signature
scheme that checks the validity of each received packet without decoding the whole file. Then
the contamination can be contained within one hop, and we can avoid the decoding delay. In uncoded
systems, the source knows all the packets being transmitted in the network, and therefore, can
sign each one of them. However, in a coded system, each node produces "new" packets, and
standard digital signature schemes do not apply. Previous work that attempts to solve this problem
is based on homomorphic hash functions [8], [26], [27], Secure Random Checksum [16], or Weil
pairing on elliptic curves [28]. In this section, we introduce a novel signature scheme for the
coded system based on the Discrete Logarithm problem.

Fig. 4. Expected number of contaminated nodes as a function of time (timestep $t$), for $|\mathcal{N}| = 30$, $|\mathcal{N}_s| = 5$, $|\mathcal{N}_r| = 6$, $d = 3$ and $Y = 1$. The
results for the static and evolving informed nodes list are shown in full and dashed, respectively.

We consider a directed graph with a set of nodes $\mathcal{N}$. A source node has a large file to be
sent to receiver nodes. The file is divided into $m$ packets. A node in the network receives linear
combinations of the packets from the source or from other nodes. In this framework, a node is
also a server for the packets it has downloaded, and always sends out random linear combinations
of all the packets it has obtained so far to other nodes. When a receiver has received $m$ linearly
independent packets, it can reconstruct the whole file. We denote the $m$ original packets as
$\bar{\mathbf{v}}_1, \ldots, \bar{\mathbf{v}}_m$, and view them as elements in the $l$-dimensional vector space $\mathbb{F}_p^l$, where $p$ is a prime. The
source node adds coding vectors to create $\mathbf{v}_1, \ldots, \mathbf{v}_m$, $\mathbf{v}_i = (0, \ldots, 1, \ldots, 0, \bar{v}_{i1}, \ldots, \bar{v}_{il})$, where the
first $m$ elements are zero except the $i$th element which is 1, and $\bar{v}_{ij} \in \mathbb{F}_p$ is the $j$th element in
$\bar{\mathbf{v}}_i$. A packet $\mathbf{w}$ received by a node is a linear combination of these vectors,

$$\mathbf{w} = \sum_{i=1}^{m} \beta_i \mathbf{v}_i,$$

where $(\beta_1, \ldots, \beta_m)$ is the global coding vector.

The key observation for our signature scheme is that the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_m$ span a subspace
$V$ of $\mathbb{F}_p^{m+l}$, and a received vector $\mathbf{w}$ is a valid linear combination of vectors $\mathbf{v}_1, \ldots, \mathbf{v}_m$ if and
only if it belongs to $V$. Our scheme is based on standard modular arithmetic (in particular the
hardness of the Discrete Logarithm problem) and on an invariant signature for the linear span
$V$. Each node verifies the integrity of a received vector $\mathbf{w}$ by checking the membership of $\mathbf{w}$ in
$V$ based on the signature.

Our signature scheme is defined by the following ingredients: 

• $q$: a large prime number such that $p$ is a divisor of $q - 1$. Note that standard techniques,
such as those used in the Digital Signature Algorithm (DSA) [29], apply to find such $q$.

• $g$: a generator of the group $G$ of order $p$ in $\mathbb{F}_q$. Since the order of the multiplicative group
$\mathbb{F}_q^*$ is $q - 1$ (a multiple of $p$), we can always find a subgroup, $G$, with order $p$ in $\mathbb{F}_q^*$.

• Private key: $K_s = \{\alpha_i\}_{i=1,\ldots,m+l}$, a random set of elements in $\mathbb{F}_p^*$ known only to the source.

• Public key: $K_p = \{h_i = g^{\alpha_i}\}_{i=1,\ldots,m+l}$, signed by some standard signature scheme, e.g.,
DSA, and published by the source.

To distribute a file in a secure manner, the signature scheme works as follows. 
1) Using the vectors $\mathbf{v}_1, \ldots, \mathbf{v}_m$ from the file, the source finds a vector $\mathbf{u} = (u_1, \ldots, u_{m+l}) \in
\mathbb{F}_p^{m+l}$ orthogonal to all vectors in $V$. Specifically, the source finds a non-zero solution, $\mathbf{u}$,
to the set of equations $\mathbf{v}_i \cdot \mathbf{u} = 0$ for $i = 1, \ldots, m$.

2) The source computes the vector $\mathbf{x} = (u_1/\alpha_1, u_2/\alpha_2, \ldots, u_{m+l}/\alpha_{m+l})$.

3) The source signs $\mathbf{x}$ with some standard signature scheme and publishes $\mathbf{x}$. We refer to the
vector $\mathbf{x}$ as the signature of the file being distributed.

4) The client node verifies that $\mathbf{x}$ is signed by the source.

5) When a node receives a vector $\mathbf{w}$ and wants to verify that $\mathbf{w}$ is in $V$, it computes

$$d = \prod_{i=1}^{m+l} h_i^{x_i w_i}$$

and verifies that $d = 1$.

To see that $d$ is equal to 1 for any valid $\mathbf{w}$, we have

$$d = \prod_{i=1}^{m+l} h_i^{x_i w_i} = \prod_{i=1}^{m+l} (g^{\alpha_i})^{u_i w_i / \alpha_i} = \prod_{i=1}^{m+l} g^{u_i w_i} = g^{\sum_{i=1}^{m+l} u_i w_i} = 1,$$

where the last equality comes from the fact that $\mathbf{u}$ is orthogonal to all vectors in $V$.
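The steps above can be sketched end to end with toy parameters. Everything specific in this example is an assumption made for illustration: the primes `P = 101` and `Q = 607` (chosen so that `P` divides `Q - 1`) are far too small for real use, the helper names are hypothetical, and the standard signature over the public key and over $\mathbf{x}$ (steps 3 and 4) is omitted. Because each $\mathbf{v}_i$ starts with a unit coding vector, an orthogonal $\mathbf{u}$ can be found by picking the last $l$ entries at random and solving for the first $m$.

```python
import random

P, Q = 101, 607              # toy primes with P | Q - 1 (assumed values)
G = pow(2, (Q - 1) // P, Q)  # element of order P in F_Q^* (64, not 1)


def keygen(n, rng):
    """Private key alpha_i in F_P^*, public key h_i = G^alpha_i mod Q."""
    alpha = [rng.randrange(1, P) for _ in range(n)]
    return alpha, [pow(G, a, Q) for a in alpha]


def augment(packets):
    """Prepend the unit coding vector e_i to each original packet."""
    m = len(packets)
    return [[int(i == j) for j in range(m)] + list(pkt)
            for i, pkt in enumerate(packets)]


def sign(packets, alpha, rng):
    """Steps 1-2: find u orthogonal to every v_i, return x_i = u_i / alpha_i."""
    tail = [rng.randrange(1, P) for _ in range(len(packets[0]))]
    # v_i . u = u_i + sum_j pkt_i[j] * tail[j] = 0  =>  solve for u_i
    head = [(-sum(a * b for a, b in zip(pkt, tail))) % P for pkt in packets]
    u = head + tail
    return [(ui * pow(ai, -1, P)) % P for ui, ai in zip(u, alpha)]


def verify(w, x, pub):
    """Step 5: accept w iff prod_i h_i^(x_i * w_i) == 1 in F_Q."""
    d = 1
    for hi, xi, wi in zip(pub, x, w):
        d = d * pow(hi, (xi * wi) % P, Q) % Q  # exponents live in F_P
    return d == 1


def combine(vectors, rng):
    """Random linear combination of augmented vectors over F_P."""
    coeffs = [rng.randrange(P) for _ in vectors]
    return [sum(c * v[j] for c, v in zip(coeffs, vectors)) % P
            for j in range(len(vectors[0]))]
```

A forwarding node would call `verify` on every received vector and drop those that fail, without decoding. With a field this small, a randomly corrupted packet still passes with probability about $1/p$, which is one reason the parameters quoted in Section V are much larger.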

Next, we show that the system described above is secure. In essence, the theorem below shows 
that given a set of vectors that satisfy the signature verification criterion, it is provably as hard 
as the Discrete Logarithm problem to find new vectors that also satisfy the verification criterion 
other than those that are in the linear span of the vectors already known. 

Definition 1: Let $p$ be a prime number and $G$ be a multiplicative cyclic group of order $p$. Let
$k$ and $n$ be two integers such that $k < n$, and $\Gamma = \{h_1, \ldots, h_n\}$ be a set of generators of $G$. Given
a linear subspace, $V$, of rank $k$ in $\mathbb{F}_p^n$ such that for every $\mathbf{v} \in V$, the equality $\Gamma^{\mathbf{v}} \triangleq \prod_{i=1}^{n} h_i^{v_i} = 1$
holds, we define the $(p, k, n)$-Diffie-Hellman problem as the problem of finding a vector $\mathbf{w} \in \mathbb{F}_p^n$
with $\Gamma^{\mathbf{w}} = 1$ but $\mathbf{w} \notin V$.

By this definition, the problem of finding an invalid vector that satisfies our signature verification
criterion is a $(p, m, m + l)$-Diffie-Hellman problem. Note that in general, the $(p, n - 1, n)$-
Diffie-Hellman problem has no solution. This is because if $V$ has rank $n - 1$ and a $\mathbf{w}'$ exists
such that $\Gamma^{\mathbf{w}'} = 1$ and $\mathbf{w}' \notin V$, then $\mathbf{w}' + V$ spans the whole space, and any vector $\mathbf{w} \in \mathbb{F}_p^n$
would satisfy $\Gamma^{\mathbf{w}} = 1$. This is clearly not true; therefore, no such $\mathbf{w}'$ exists.

Theorem 3: For any $k < n - 1$, the $(p, k, n)$-Diffie-Hellman problem is as hard as the Discrete
Logarithm problem.

Proof: Assume there exists an efficient algorithm to solve the $(p, k, n)$-Diffie-Hellman
problem, and we wish to compute the discrete logarithm $\log_g(z)$ for some $z = g^x$, where $g$ is a
generator of a cyclic group $G$ with order $p$. We can choose two random vectors $\mathbf{r} = (r_1, \ldots, r_n)$
and $\mathbf{s} = (s_1, \ldots, s_n)$ in $\mathbb{F}_p^n$ and construct $\Gamma = \{h_1, \ldots, h_n\}$, where $h_i = z^{r_i} g^{s_i}$ for $i = 1, \ldots, n$. We
then find $k$ linearly independent (and otherwise random) solutions $\mathbf{v}_1, \ldots, \mathbf{v}_k$ to the equations

$$\mathbf{v} \cdot \mathbf{r} = 0 \quad \text{and} \quad \mathbf{v} \cdot \mathbf{s} = 0.$$

Note that there exist $n - 2$ linearly independent vector solutions to the above equations. Let $V$
be the linear span of $\{\mathbf{v}_1, \ldots, \mathbf{v}_k\}$; then any vector $\mathbf{v} \in V$ satisfies $\Gamma^{\mathbf{v}} = 1$. Now, if we have
an algorithm for the $(p, k, n)$-Diffie-Hellman problem, we can find a vector $\mathbf{w} \notin V$ such that
$\Gamma^{\mathbf{w}} = 1$. This vector would satisfy $\mathbf{w} \cdot (x\mathbf{r} + \mathbf{s}) = 0$. Since $\mathbf{r}$ is statistically independent from
$(x\mathbf{r} + \mathbf{s})$, with probability greater than $1 - 1/p$, we have $\mathbf{w} \cdot \mathbf{r} \neq 0$. In this case, we can compute

$$\log_g(z) = x = -\frac{\mathbf{w} \cdot \mathbf{s}}{\mathbf{w} \cdot \mathbf{r}}.$$

This means the ability to solve the $(p, k, n)$-Diffie-Hellman problem implies the ability to solve
the Discrete Logarithm problem. ■
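The reduction can be checked numerically on a toy group. In the sketch below, the parameters `P = 11`, `Q = 23`, `G = 4` and the function names are assumptions for illustration, and the hypothetical efficient solver for the $(p, k, n)$-Diffie-Hellman problem is replaced by a brute-force search, which is only feasible because the group is tiny. Any $\mathbf{w}$ with $\mathbf{w} \cdot \mathbf{r} \neq 0$ lies outside $V$, since every vector in $V$ is orthogonal to $\mathbf{r}$.

```python
from itertools import product

P, Q, G = 11, 23, 4  # toy parameters: G has order P in F_Q^*


def dot(a, b):
    """Inner product over F_P."""
    return sum(x * y for x, y in zip(a, b)) % P


def gamma_pow(hs, w):
    """Gamma^w = prod_i h_i^(w_i) in F_Q."""
    out = 1
    for hi, wi in zip(hs, w):
        out = out * pow(hi, wi, Q) % Q
    return out


def recover_log(z, r, s):
    """Recover x with z = G^x from a vector w outside V with Gamma^w = 1."""
    hs = [pow(z, ri, Q) * pow(G, si, Q) % Q for ri, si in zip(r, s)]
    # Brute-force stand-in for the assumed Diffie-Hellman solver.
    for w in product(range(P), repeat=len(r)):
        if dot(w, r) != 0 and gamma_pow(hs, w) == 1:
            # w . (x r + s) = 0  =>  x = -(w . s) / (w . r)  (mod P)
            return (-dot(w, s) * pow(dot(w, r), -1, P)) % P
    return None  # no suitable w exists (e.g. s is a multiple of r)
```

For instance, `recover_log(pow(G, 7, Q), (1, 2, 3), (4, 5, 6))` reconstructs the exponent from the public values alone once such a $\mathbf{w}$ is available.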

This proof is an adaptation of a proof in an earlier publication by Boneh et al. [30].

Our signature scheme makes use of the linearity property of RLNC, and enables the nodes to
check the integrity of packets without a secure channel, unlike the homomorphic hash function
or SRC schemes [16], [26]. In addition, our scheme does not require the nodes to decode coded
packets to check their validity, and is therefore efficient in terms of delay. The computation involved
in the signature generation and verification processes is very simple. Furthermore, our scheme
relies on the Discrete Logarithm problem, which is more standardized and widely used than
the recently developed Weil pairing problem used in [28]. Lastly, we note that our signature
scheme is rateless, which is not the case for end-to-end or generation-based detection schemes.

V. Overhead analysis 

In the previous sections, we showed that our signature scheme is beneficial, as even a small
probability of attack can have a devastating effect in coded networks. However, we have not shown
that this scheme is efficient in terms of bandwidth (i.e., the overhead of adding the signature
scheme), and indeed, it is not always the case that our signature scheme is desirable. We now
study the cost and benefit of the following three Byzantine schemes: 1) our signature scheme
proposed in Section IV, 2) end-to-end error correction scheme [15], and 3) generation-based
Byzantine detection scheme [6]. If we implement Byzantine detection schemes, we can detect
contaminated data, drop them, and therefore, only transmit valid data; however, this benefit comes
with the overhead of the schemes in the form of hashes and signatures. It is important to note
that, for the dropped data, the receivers perform erasure correction, which is computationally
lighter than error correction; thus, there is no need for retransmissions.

We consider a node $n \in \mathcal{N}$ in the network as in Section IV. Node $n$ wishes to check the
validity of the data it forwards. Assume that node $n$ receives $M$ packets per time slot. Recall
that $m$ is the number of packets in a file and $l$ is the length of each packet; therefore, each
packet consists of $(m + l)$ symbols. If $n$ detects an error, then it discards that data; otherwise,
it forwards the data. The probability that $n$ receives a contaminated packet is $p_n$, as shown in
Figure 5. Note that the probability $p_n$ of an attack is topology dependent. However, in order to
compare the performance of various schemes, we use a generic per-node model to examine the
overhead incurred at a node. We assume that there is an external model of vulnerability which
gives an estimate of $p_n$. Note that the blocking probability $\Psi$ analyzed in Section III provides
such an estimate.

[Figure 5: M potentially contaminated packets arrive at node n; the probability of receiving 
any corrupted packet is p_n; only clean packets leave the node.] 

Fig. 5. Diagram of a node n in a network 



A. Overhead analysis of our packet-based signature scheme 

We examine the overhead incurred by our signature scheme. Recall from Section IV that the file 
size is ml log(p) bits. The file is divided into m packets, each of which is a vector in F_p^l. 
Thus, the overhead of the RLNC scheme is m/l times the file size, and in practical networks m ≪ l. 

The initial setup of our signature scheme involves publishing the public key K_p, which 
is (m + l) log(q) bits. In typical cryptographic applications, the sizes of p and q are 20 bytes 
(160 bits) and 128 bytes (1024 bits), respectively; thus, the size of K_p is approximately 
6(m + l)/ml times the file size. This overhead is negligible as long as 6 ≪ m ≪ l. For example, 
if we have a file of size 10MB, divided into m = 100 packets, then the overhead is approximately 6%. 
We note that the public key K_p cannot be fully reused for multiple files, as it is possible for 
a malicious node to generate a vector which is not a valid linear combination of the original 
vectors yet satisfies the check d = 1 using information obtained from previously downloaded 
files. We do not provide the details of this for want of space. 
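
The 6% figure can be reproduced with a short calculation. The sketch below is illustrative only: the function name is hypothetical, the 10MB file is assumed to be 8 x 10^7 bits, and log(p) = 160 and log(q) = 1024 bits are the typical sizes quoted above. 

```python
def public_key_overhead(file_bits: int, m: int,
                        p_bits: int = 160, q_bits: int = 1024) -> float:
    """Ratio of the public key size to the file size.

    File size is m * l * log(p) bits and K_p is (m + l) * log(q) bits,
    so the ratio is (log(q) / log(p)) * (m + l) / (m * l).
    """
    # Packet length l in symbols, derived from file size = m * l * log(p).
    l = file_bits / (m * p_bits)
    key_bits = (m + l) * q_bits
    return key_bits / file_bits


# Assumed example: a 10MB file (taken as 8e7 bits) split into m = 100 packets.
ratio = public_key_overhead(file_bits=80_000_000, m=100)
print(f"K_p overhead: {ratio:.2%} of the file size")
```

With these assumed numbers l = 5000, so the ratio is dominated by the 6/m term, which is why the overhead tracks 6% for m = 100. 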

To prevent this from happening, we can redistribute keys for each additional file using one of 
the two methods below. The first method consists of publishing a new public key K_p for each file, 
which would incur an overhead of 6(m + l)/ml times the file size. Note that if we republish K_p 
for every file, we can reuse the signature x. The second method is to update K_p partially and 
generate a new x for each file. This incurs less overhead than the first method, but it 
requires a high variability in w to be secure. This update incurs a negligible amount of 
overhead as well. For example, for a 10MB file, the overhead is less than 0.1%. 

The initial K_p distribution costs approximately 6% of our file size, and the incremental update 
of K_p and x costs much less than 6% if we use the second method. Therefore, we shall denote the 
overhead associated with our signature by o_p = (6/100)(m + l) symbols per packet, i.e., 6% overhead. 

If n detects an error in a packet, then it discards it. By doing so, n can filter out all the 
contaminated packets and use its bandwidth to transmit only valid packets. Therefore, n 
forwards on average only a (1 - p_n) fraction of the data received. 



Our signature scheme costs o_p M symbols per time slot. However, by discarding the contaminated 
packets, node n can on average save M(m + l)p_n symbols of bandwidth per time slot. 
Therefore, the net cost of the signature scheme as a fraction of the total data received is: 

$$\frac{\max\{0,\, M o_p - M(m+l)p_n\}}{M(m+l)} = \frac{\max\{0,\, o_p - (m+l)p_n\}}{m+l}. \qquad (1)$$



When p_n is high, checking each packet for errors saves bandwidth, i.e., o_p - (m + l)p_n < 0, 
which shows that the cost of the signature scheme is canceled by the bandwidth 
gained from dropping the corrupted packets. Therefore, this approach is the most sensible when 
the network is unreliable or under heavy attack. 
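
As a rough illustration of the net cost of the packet-based signature scheme, the following sketch evaluates it per packet. The function name and the example values m = 100 and l = 5000 are assumptions made for illustration; the 6% overhead fraction is the value of o_p used in this section. 

```python
def signature_net_cost(p_n: float, m: int, l: int,
                       overhead_fraction: float = 0.06) -> float:
    """Net cost of the packet-based signature as a fraction of data received.

    o_p = overhead_fraction * (m + l) symbols per packet; the saving from
    dropped packets is (m + l) * p_n symbols per packet on average.
    """
    o_p = overhead_fraction * (m + l)
    return max(0.0, o_p - (m + l) * p_n) / (m + l)


# Assumed packet dimensions, for illustration only.
for p_n in (0.0, 0.03, 0.06, 0.2):
    print(p_n, signature_net_cost(p_n, m=100, l=5000))
```

Under these assumptions the net cost reaches zero once p_n is at least o_p/(m + l), i.e. once the attack probability exceeds the relative signature overhead. 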

B. Overhead analysis of end-to-end error correction 

In this subsection, we shall use the rate-optimal error correction codes from Jaggi et al. [15]. 
As long as the attack is within the network capacity, this scheme allows the intermediate nodes 
to transmit at the remaining network capacity, i.e. the end-to-end network capacity minus the 
capacity the adversary can contaminate. In this scenario, node n just naively performs RLNC 
and forwards the data it has received. Therefore, node n transmits on average M(m + l)p_n 
contaminated symbols. Thus, the net cost as a fraction of the total data received is: 

$$\frac{M(m+l)p_n}{M(m+l)} = p_n. \qquad (2)$$

C. Overhead analysis of generation-based Byzantine detection scheme 

We now analyze the performance of the algorithm proposed by Ho et al. [25], which uses 
random block linear network coding with generation size G (although we have focused on RLNC 
so far, it is possible to extend these results by considering m as the generation size G). This 
scheme is very cheap: with 2% overhead, the detection probability is at least 98.9%. We denote 
the overhead associated with this scheme by o_g = (2/100)(m + l)G symbols per generation. 

After collecting enough packets from the generation, node n checks for possible errors in the 
generation, which can incur a large delay. If n detects an error, it discards the entire generation 
of G packets; otherwise, it forwards the data. This scheme requires only one hash for the entire 
generation, saving bits on the hashes compared to our signature scheme. However, it can be 
inefficient, as one contaminated packet can cause n to discard an entire generation. 

The probability p_g of dropping a generation of G packets is given by: 

$$p_g = 1 - \Pr(\text{all } G \text{ packets are valid}) = 1 - (1 - p_n)^G.$$
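
To get a feel for how quickly the drop probability grows with the generation size, the sketch below evaluates it for the generation sizes used in the figures. The function name and the example value p_n = 0.05 are assumptions chosen for illustration. 

```python
def generation_drop_probability(p_n: float, G: int) -> float:
    """Probability that at least one of G packets is contaminated.

    Assumes each packet is contaminated independently with probability p_n.
    """
    return 1.0 - (1.0 - p_n) ** G


# Example attack probability (assumed) across several generation sizes.
for G in (2, 4, 10, 20, 100):
    print(G, generation_drop_probability(0.05, G))
```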

The cost and benefit of this scheme include three components: (i) the hash of o_g symbols per 
generation, (ii) valid packets which are discarded if the generation is deemed contaminated, and 
(iii) bandwidth saved by dropping contaminated packets. The expected number of valid symbols 
dropped per generation is p_g(1 - p_n)(m + l)G. The expected number of contaminated symbols 
per generation is p_n(m + l)G. Thus, the net cost as a fraction of the total data received is: 

$$\frac{\max\{0,\, o_g + p_g(1-p_n)(m+l)G - p_n(m+l)G\}}{(m+l)G}. \qquad (3)$$
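
The three components of the generation-based net cost can be put together in a few lines. This is a minimal sketch with a hypothetical function name and assumed example dimensions (m = 100, l = 5000); it takes o_g as 2% of the (m + l)G symbols in a generation, as stated above. 

```python
def generation_net_cost(p_n: float, G: int, m: int, l: int,
                        overhead_fraction: float = 0.02) -> float:
    """Net cost of generation-based detection as a fraction of data received."""
    symbols = (m + l) * G                      # symbols per generation
    o_g = overhead_fraction * symbols          # (i) hash overhead
    p_g = 1.0 - (1.0 - p_n) ** G               # generation drop probability
    valid_dropped = p_g * (1.0 - p_n) * symbols  # (ii) valid symbols discarded
    contaminated = p_n * symbols               # (iii) contaminated symbols saved
    return max(0.0, o_g + valid_dropped - contaminated) / symbols


# Assumed packet dimensions, for illustration only.
for G in (2, 10, 100):
    print(G, generation_net_cost(0.1, G, m=100, l=5000))
```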

For this scheme to work, n needs to receive at least G packets from each generation to decode 
and detect errors. This may seem to indicate that this scheme is only applicable as an end-to-end 
scheme, but it can be extended to a local Byzantine detection scheme as shown in Figure 6. 

Fig. 6. Network with non-malicious nodes A, B, C, D, E, and F, where node A is transmitting at a 
total rate of r to node F; however, A sends half of its data through B and the other half through C. 
Therefore, B and C can check the validity of the sub-generation they receive, where by 
sub-generation we mean a collection of G/2 encoded packets from A. By a similar argument, D, E, 
and F can check the validity of a sub-generation of G/4, G/4, and G packets from A, respectively. 

The cost of the generation-based scheme increases dramatically with G. If G is large enough, 
the probability of at least one corrupted packet in a generation is high even for small p_n. Thus, 
a large G is undesirable, as almost every generation is found faulty and dropped, making the 
throughput go to zero. This can be verified with an asymptotic analysis of Equation (3): 

$$\lim_{G \to \infty} \frac{\max\{0,\, o_g + p_g(1-p_n)(m+l)G - p_n(m+l)G\}}{(m+l)G} = \max\{0,\, 1 - 2p_n\}.$$
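
The limit can be checked term by term; the following is a sketch of the intermediate step, which is not spelled out in the original text. Dividing the numerator of Equation (3) by (m + l)G gives 

$$\frac{o_g + p_g(1-p_n)(m+l)G - p_n(m+l)G}{(m+l)G} = \frac{o_g}{(m+l)G} + p_g(1-p_n) - p_n,$$

and for any fixed p_n with 0 < p_n ≤ 1, p_g = 1 - (1 - p_n)^G tends to 1 as G grows, so 

$$\lim_{G \to \infty} \bigl[\, p_g(1-p_n) - p_n \,\bigr] = (1-p_n) - p_n = 1 - 2p_n.$$

The stated limit therefore corresponds to neglecting the hash term o_g/((m + l)G); with o_g taken as 2% of the generation, this term contributes a constant 2/100. 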

Note in Figure 7 that the cost peaks at p_n ≈ 0.2. At p_n ≈ 0.2, the scheme drops many 
generations for a few corrupted packets. Thus, at a moderate rate of attack, the generation-based 
scheme suffers. When p_n < 0.2, the generation-based scheme does well, since p_n is low and 
the cost of the hash is distributed across G packets. As p_n increases from 0.2 to 0.5, the 
throughput to the receiver decreases as more generations are dropped. When p_n > 0.5, this scheme 
discards almost all generations; thus, the expected throughput is near zero. 

Fig. 7. Ratio between the expected overhead and the total data received by a node for 
generation-based detection with generation size G, packet size 1000 bits, and hash size 
o_g = (2/100)(m + l)G symbols per generation 

D. Trade-offs and comparisons 

In Figures 8 and 9, we compare the three schemes. As mentioned in Section V-B, the expected 
cost of the error correction scheme is linearly proportional to p_n. Therefore, for large p_n, this 
scheme performs badly. However, this simple scheme, where a node naively forwards all data it 
receives, outperforms the detection schemes when p_n is low (p_n < 0.03). When p_n is small, the 
overhead of detection exceeds the cost introduced by the attackers. 
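
The 0.03 threshold is consistent with equating the end-to-end cost p_n with the packet-based signature cost at 6% overhead. The sketch below illustrates this; the function names are hypothetical, and both costs are expressed as fractions of the data received, so the factor (m + l) cancels. 

```python
def end_to_end_cost(p_n: float) -> float:
    """End-to-end error correction: all contaminated symbols are forwarded."""
    return p_n


def packet_signature_cost(p_n: float, overhead: float = 0.06) -> float:
    """Packet-based signature cost with o_p = overhead * (m + l)."""
    return max(0.0, overhead - p_n)


def crossover(overhead: float = 0.06) -> float:
    """Attack probability at which p_n = overhead - p_n."""
    return overhead / 2.0


for p_n in (0.01, 0.03, 0.05):
    print(p_n, end_to_end_cost(p_n), packet_signature_cost(p_n))
```

Below the crossover, naive forwarding wastes less bandwidth than the signature costs; above it, the signature scheme is cheaper. 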

When p_n is low, the overhead of our signature is costly, since we are devoting o_p symbols per 
packet to detect an unlikely attack. In such a setting, the generation-based scheme performs well, 
as it distributes the cost of the hash (o_g symbols) over G packets. However, as p_n increases, the 
cost of our signature becomes negligible since the bandwidth wasted by contaminated packets 
increases; thus, our signature scheme outperforms the generation-based scheme. However, it is 
important to note that we underestimate the overhead associated with our signature scheme in 
this paper, as we do not take into account the public key distribution cost, which the 
generation-based scheme does not require. Thus, depending on the public key distribution 
infrastructure used and the frequency of key renewal, our scheme will incur a higher overhead, 
resulting in an outward shift in the overhead in Figure 8. 

[Figure 8: expected overhead ratio versus probability of error/attack p_n. Curves: end-to-end 
error correction; packet-based (o_p = 6%); generation-based (o_g = 2%) with G = 2, 4, 10, 20, 
and 100.] 

Fig. 8. Ratio between the expected overhead and the total data received by a node with 
o_p = (6/100)(m + l), o_g = (2/100)(m + l)G 
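
The comparison of the three schemes can be sketched numerically from the net-cost expressions of this section. The function names below are hypothetical, the costs are normalized by the data received, and, as in the analysis above, the public key distribution cost of the signature scheme is not included. 

```python
def end_to_end(p_n: float) -> float:
    """Normalized cost of end-to-end error correction."""
    return p_n


def packet_based(p_n: float, o: float = 0.06) -> float:
    """Normalized cost of the packet-based signature scheme."""
    return max(0.0, o - p_n)


def generation_based(p_n: float, G: int, o: float = 0.02) -> float:
    """Normalized cost of generation-based detection with generation size G."""
    p_g = 1.0 - (1.0 - p_n) ** G
    return max(0.0, o + p_g * (1.0 - p_n) - p_n)


def cheapest(p_n: float, G: int) -> str:
    """Name of the scheme with the smallest normalized cost."""
    costs = {
        "end-to-end": end_to_end(p_n),
        "packet-based": packet_based(p_n),
        "generation-based": generation_based(p_n, G),
    }
    return min(costs, key=costs.get)


for p_n in (0.001, 0.01, 0.1, 0.3):
    print(p_n, cheapest(p_n, G=10))
```

Sweeping p_n and G in this way gives a quick check of which regime a given network falls into before committing to a detection scheme. 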

We briefly note the computational cost of implementing these schemes. When using our signature 
scheme or the generation-based detection scheme, node n avoids wasting its bandwidth on 
contaminated data by dropping a single packet or an entire generation, respectively. Furthermore, 
there is no need to retransmit the dropped data, as the receivers can perform erasure 
correction on the packets or the generations that have been dropped. It is important to note that 
for the end-to-end error correction scheme, the receivers need to perform error correction, which 
is computationally more expensive than erasure correction. 

VI. Conclusions 

In this paper, we studied the problem of Byzantine attacks in network coded P2P networks. 
We used randomly evolving graphs to characterize the impact of Byzantine attackers on the 
receiver's ability to recover a file. As shown by our analysis, even a small number of attackers 
can contaminate most of the flow to the receivers. Motivated by this result, we proposed a novel 
signature scheme for any network using RLNC. The scheme makes use of the linearity of the 
code, and it can be used to easily check the validity of all received packets. Using this scheme, 
we can prevent the intermediate nodes from spreading the contamination by allowing nodes to 
detect contaminated data, drop them, and therefore, only transmit valid data. We emphasize that 
there is no need to retransmit the dropped data, since the receivers can perform erasure 
correction, which is computationally cheaper than error correction. 

We analyzed the cost and benefit of the signature scheme, and compared it with the end-to- 
end error correction scheme and the generation-based detection scheme. We showed that the 
overhead associated with our scheme is low. Furthermore, when the probability of Byzantine 
attack is high, it is the most bandwidth efficient. However, if the probability of attack is low, 
generation-based Byzantine detection schemes are more appropriate. 